G3: Genes, Genomes, Genetics

Oxford University Press (OUP)

Preprints posted in the last 30 days, ranked by how well they match G3: Genes, Genomes, Genetics's content profile, based on 222 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
Telomeric amplicons of SUL1 and Y' in yeast are generated by microhomology-mediated break induced replication occurring in cis

Brewer, B. J.; Martin, R.; Ramage, E.; Payen, C.; Di Rienzi, S. C.; Zhao, Y.; Zane, K.; Verhey, J.; Galey, M.; Miller, D. E.; Ong, G. T.; McKee, J. L.; Alvino, G. M.; Dunham, M. J.; Raghuraman, M. K.

2026-04-09 genetics 10.64898/2026.04.07.716220 medRxiv
Top 0.1%
4.8%

Gene amplification is a potent driver of evolution and is thought to contribute to genetic diseases, including cancer. The yeast Saccharomyces cerevisiae is a powerful organism for understanding amplification mechanisms. When yeast is grown long term in sulfate-limiting chemostats, amplification of the gene that encodes the primary sulfate transporter, SUL1, is a common outcome. Here we describe a form of SUL1 amplification in which multiple copies of the right terminal region of chromosome II are appended in tandem to a native telomere. We find this form of amplicon when we delete the origin of replication next to SUL1 or delete a variety of genes involved in DNA metabolism. It is the only form of amplification found in a yku70Δ mutant, suggesting that unprotected telomeres are involved. We propose that these terminal addition events occur when the unprotected 3' G1-3T telomeric sequence invades a short (~7 bp) internal telomere sequence (ITS) to begin a form of microhomology-mediated break-induced replication (mmBIR) that has been documented in type-I survivors of telomerase mutants. In addition to amplification of the right end of chromosome II, we also find that telomeres containing the sub-telomeric repeat Y' experience similar tandem amplification events and show that their formation is reduced in a pol32Δ mutant, which lacks a gene required for mmBIR. Within individual amplicons the ITSs and Y' repeats are nearly identical, suggesting that the multiple copies of the amplified region are generated in a single mmBIR event that we describe as pseudo-rolling circle mmBIR. A similar amplification event at the P-telomere of human chromosome 18 has four copies of a ~54 kb region separated by ITSs of nearly identical size. This finding suggests that these additional copies of the terminal fragment of human chromosome 18 arose by the same pseudo-rolling circle mechanism, perhaps during a period of telomeric stress.
AUTHOR SUMMARY: The human genome is peppered with duplicates (or higher numbers) of segments that are located at sites both nearby and distant from the original, ancestral segments. These Copy Number Variants, or CNVs, appear to be highly variable among different individuals and are being examined with great interest as potential loci associated with genetic disease. Experimentally determining how these CNVs arise and become distributed across the genome is nearly impossible using humans. We are using budding yeast as the model organism to explore mechanisms of gene amplification. In this work we show that destabilizing the ends of yeast chromosomes (telomeres) or interfering with genes involved in the replication, repair, or recombination of DNA results in a specific form of segmental copy number increase that is initiated at telomeres. We propose that a telomere invades an internal chromosome site and sets up a pseudo-circular template for conservative DNA replication. The outcome is a chromosome with multiple, identical copies of a chromosome end arranged in tandem. We believe that this is also a major mechanism used by cells to repair telomeres that have become eroded during aging.

2
Climate cycles drive demographic history and genomic divergence in cactus wrens (Campylorhynchus brunneicapillus) across North American warm deserts

Rodriguez-Rojas, P. C.; Oceguera-Figueroa, A. F.; Navarro-Siguenza, A. G.; Vazquez Miranda, H.

2026-03-26 evolutionary biology 10.64898/2026.03.24.714001 medRxiv
Top 0.1%
3.7%

In this study, we characterized the genetic structure and reconstructed the demographic history of cactus wrens (Campylorhynchus brunneicapillus), a species endemic to the desert regions of North America that shows clear phenotypic and genotypic variation. We evaluated the effects of historical climate change on the structure and population dynamics of desert species using genomic data obtained through genotyping by sequencing (GBS) and applied a population structure analysis (FST and ADMIXTURE), revealing two genetically differentiated groups: one continental and another peninsular in Baja California. Subsequently, we implemented the MSMC2 coalescent model on data divided into autosomal regions and the Z sex chromosome to estimate changes in effective population size (Ne) through evolutionary time. Additionally, we developed ecological niche models (ENMs) projected to the Last Glacial Maximum (LGM), Last Interglacial (LIG), present, and future (2060-2080). Results indicate that both populations maintained moderate Ne values before the LGM, then experienced severe bottlenecks (Ne ~ 10^2-10^3), followed by a sustained expansion. However, recovery was limited to the Z chromosome of the peninsular population. These findings reveal how glaciations and interglacials shaped the evolutionary history of desert species and provide genomic evidence of the splitting of C. affinis from C. brunneicapillus.
Article summary: This research examines how climate changes shaped the genetic diversity of cactus wrens across North American warm deserts. Using coalescent methods, researchers tracked effective population size changes over 100,000 years; using ecological niche modeling, they predicted habitat suitability across climate periods. Results showed that continental and peninsular populations experienced bottlenecks during the Last Glacial Maximum, followed by demographic recovery during warm periods. However, the sex chromosome (Z) revealed male-biased demographic patterns in peninsular populations. Future projections indicated habitat suitability reductions for peninsular populations, highlighting conservation concerns. These findings demonstrate that past climate shaped the genetic diversity of cactus wrens.

3
Epistatic fitness landscapes emerge from parallel adaptive walks in breeding network metapopulations

Monyak, T.; Morris, G.

2026-03-20 genetics 10.64898/2026.03.18.712732 medRxiv
Top 0.2%
3.1%

Global networks of crop breeding programs leverage diverse germplasm, but diversity increases the complexity of maintaining stability in their elite genepools. To characterize genetic heterogeneity in breeding metapopulations and develop insights on how to manage it, we simulated the evolution of breeding populations on fitness landscapes. We revealed the geometric decrease in the average effect size of alleles segregating as standing variation that become fixed along an adaptive walk. We also demonstrated how independent adaptive walks of subpopulations are influenced by genetic drift, leading to cryptic genetic heterogeneity among elite genepools. This variation is released when elite lines derived from independent subpopulations are crossed, leading to segregation for 2-4X more major QTL in admixed families than in unadmixed families, and 2-4X more epistatic interactions. The emergent property of fitness epistasis for traits under stabilizing selection is well understood in evolutionary genetics, but under-appreciated in crop quantitative genetics. To highlight the importance of this phenomenon, we constructed an empirical genotype-to-fitness landscape from the sorghum NAM, a global admixed prebreeding resource, demonstrating the utility of fitness landscapes for inferring genetic compatibilities within metapopulations. Our findings suggest that in breeding networks, strategies for effective germplasm exchange must account for epistasis in the oligogenic component of the genetic architecture of locally adapted traits.
Article summary: Modern public sector crop improvement happens in networks of breeding programs that routinely exchange genetic information. Traditional models for understanding quantitative traits have limited predictiveness in situations with such genetic heterogeneity. This study uses breeding simulations and empirical data to show the utility of the fitness landscape framework for characterizing the genetic architecture of complex traits in breeding metapopulations. By simulating the evolution of breeding programs and integration into networks, it demonstrates how epistatic interactions between large-effect alleles are a fundamental property that must be accounted for when exchanging germplasm.

4
Track Hub Quickload Translator: Convert Track Hub or Quickload data for viewing in the UCSC Genome Browser or the Integrated Genome Browser

Freese, N. H.; Raveendran, K.; Sirigineedi, J. S.; Chinta, U. L.; Badzuh, P.; Marne, O.; Shetty, C.; Naylor, I.; Jagarapu, S.; Loraine, A.

2026-03-30 bioinformatics 10.64898/2026.03.26.708838 medRxiv
Top 0.2%
2.9%

Summary: Track Hub Quickload Translator is a web application that interconverts University of California Santa Cruz (UCSC) Genome Browser track hub and Integrated Genome Browser (IGB) data repository formats by translating the track hub or Quickload configuration files to the other genome browser's required format. This new work enables researchers to work with tens of thousands of published genome assemblies for the first time using either browser.
Availability and Implementation: Track Hub Quickload Translator is implemented using Python 3 and is freely available to use at translate.bioviz.org. Integrated Genome Browser is available from BioViz.org. Track Hub Quickload Translator, GenArk Genomes, and the Integrated Genome Browser source code are available from github.org/lorainelab.
Contact: aloraine@charlotte.edu

5
Heparan sulfate is essential for Drosophila FGF export

Barbosa, G. O.; Solis-Calero, C.; Kornberg, T.

2026-03-26 developmental biology 10.64898/2026.03.24.714045 medRxiv
Top 0.2%
2.8%

Binding of Fibroblast growth factor (FGF) to a heparan sulfate proteoglycan (HSPG) is required for paracrine FGF signaling. To improve our understanding of FGF:HSPG association, we developed a method to monitor export of the Drosophila FGF ortholog Branchless (Bnl) in vivo. We detected Bnl on the surface of approximately 10% of Bnl-producing cells, but Bnl on the surface of cells depleted of HS was much reduced. HS depletion also non-autonomously decreased the activity of cytonemes that extend from cells that receive Bnl. These results are consistent with the idea that Bnl export to the cell surface is regulated, that intracellular binding of an HSPG to Bnl in producing cells is essential for export, and that cells that take up Bnl actively participate in its release from producing cells.
Summary: Levels of FGF exported to the surface of FGF-expressing cells are dependent on intracellular heparan sulfate proteoglycans.

6
A reference genome assembly for Quercus canariensis Willd

Couturier, F.; Cravero, C.; Lesur, I.; Confais, J.; Belmonte, E.; Piat, L.; Marande, W.; Rellstab, C.; Valbuena, M.; Saez-Laguna, E.; Duvaux, L.

2026-04-01 genetics 10.64898/2026.03.31.714748 medRxiv
Top 0.2%
2.8%

We present a genome assembly from a specimen of Quercus canariensis (Fagaceae; Fagales; Magnoliopsida). The assembly was generated using PacBio HiFi long reads with an approximate sequencing depth of 39X and scaffolded using a reference-guided approach. The genome sequence has a total length of 816.0 megabases for haplotype 1 and 804.8 megabases for haplotype 2. The two haplotypes are each resolved into 12 chromosomal pseudomolecules, with only 3.48% and 1.36% of sequences remaining unplaced in haplotypes 1 and 2, respectively. Assembly completeness is supported by BUSCO scores of 98.3% and 98.2% complete genes for haplotypes 1 and 2, respectively. Structural annotation identified 51,882 and 46,482 protein-coding genes in haplotypes 1 and 2, respectively. This genome assembly provides the first chromosome-scale reference genome for Q. canariensis, laying the base for future genomic and evolutionary studies in this understudied species of the hybridizing white oak species complex.
Taxonomy: Lineage: cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae; rosids; fabids; Fagales; Fagaceae; Quercus. EBI:txid568684, Quercus canariensis Willd. 1809 (Willdenow).

7
A stable genomic variant for photoperiodic flowering plasticity to enhance grain mold escape and yield stability in sorghum

Hodehou, D. A. T.; Diatta, C.; Bodian, S.; Ndour, M.; Sambakhe, D.; Sine, B.; Felderhoff, T.; Diouf, D.; Morris, G. P.; Kane, N. A.; Faye, J. M.

2026-04-04 genetics 10.64898/2026.04.01.715939 medRxiv
Top 0.2%
2.7%

Grain mold severely constrains sorghum [Sorghum bicolor (L.) Moench] productivity and grain quality in subhumid environments. Photoperiod-sensitive flowering plays a key role in mold avoidance and yield stability along north-south rainfall gradients. In response to the high susceptibility of elite cultivars in subhumid zones of Senegal, we developed and characterized a recombinant inbred line (RIL) population derived from Nganda (grain mold-susceptible) and Grinkan (photoperiod-sensitive) varieties. The population was evaluated across three distinct agro-ecological zones over two years. Environmental indices derived from genotype-environment interactions, together with defined growth windows, strongly influenced flag leaf appearance (FLA), a photoperiodic flowering trait. Plasticity parameters (intercept and slope) for environmental indices, FLA, grain mold severity, and yield enabled identification of loci contributing to flowering response, mold resistance, and yield stability. The maturity gene Ma1 and two QTLs for FLA, qFLA6.2 and qFLA6.3, were identified, stable across environments, and colocalized with grain mold and yield QTLs. The wild-type Ma1 allele from Grinkan delayed FLA and reduced grain mold damage but was not associated with increased yield. The Ma1 effect was confirmed using the developed breeder-friendly KASP marker, Sbv3.1_06_40312464K, in 174 F3 three-way cross families. Photoperiod-sensitive lines with intermediate-to-late FLA alleles showed strong negative associations with mold damage. Overall, the identified stable loci and candidate lines provide foundations for effective molecular breeding of climate-resilient varieties.
PLAIN LANGUAGE SUMMARY: Grain mold is a fungal disease that reduces sorghum grain yield and quality, particularly in subhumid climates. With the limited number of resistant elite varieties, photoperiod-sensitive flowering, which responds to day length variation, can contribute to grain mold escape at the end of rainy seasons. We characterized 286 sorghum recombinant inbred lines across three contrasting environments over two years along rainfall gradients in Senegal. Using flag leaf appearance (FLA), which is a photoperiodic flowering trait, we revealed strong genotype-environment interactions for FLA and genotypic plasticity. We identified and validated the common genomic locus associated with FLA variation and its plasticity across environments, the canonical maturity gene Ma1, which was influenced by temperature variation across environments. The presence of Ma1 in the background of photoperiod-sensitive lines enhances grain mold avoidance and yield stability along rainfall gradients in Senegal.
CORE IDEAS:
- We investigated photoperiodic flowering plasticity in sorghum as a contributor to grain mold resistance and yield stability along rainfall gradients.
- The Maturity locus Ma1 (qFLA6.1) is the major contributor of photoperiodic flowering and its plasticity across semi-arid and subhumid environments.
- Hybrid genotypes carrying two stable loci qFLA6.1 and qFLA6.2 sustain high grain mold avoidance in diverse environments.
- Photoperiod-sensitive lines with medium to late flowering times are effective in avoiding grain mold, while maintaining yield stability in subhumid regions.

8
Heterologous expression of the human cohesin complex in Saccharomyces cerevisiae results in a dominant-negative phenotype

Stephens, E.; Hamza, A.; Driessen, M. R. M.; O'Neil, N. J.; Stirling, P. C.; Hieter, P.

2026-04-07 genetics 10.64898/2026.04.03.716359 medRxiv
Top 0.2%
2.6%

The cohesin complex has conserved roles in sister chromatid cohesion, DNA replication, genome organization, and the DNA damage response. We heterologously expressed the human cohesin complex in yeast to probe the behaviour of human cohesin. Human cohesin was unable to complement loss of function mutations in yeast cohesin, either as single subunits or as complexes, including in the context of co-expressing up to 12 human cohesin-associated genes. Heterologous expression of human cohesin in yeast expressing wildtype yeast cohesin resulted in dominant cohesion dysregulation and DNA damage sensitivity phenotypes. We used co-immunoprecipitation to demonstrate that human SMC proteins interact with endogenous yeast cohesin rings creating dominant-negative hybrid complexes that disrupt endogenous cohesin biology.

9
Humanization of the rpb9 locus in fission yeast reveals conserved and divergent roles of rpb9 and human POLR2I

Finkel, J. M.; Williams, M. G.; Nirmal, M. B.; Pandey, S.; Howe, E. D.; Liu, C. T.; Lohman, J. R.; Sharma, N.; Vo, T. V.

2026-04-04 synthetic biology 10.64898/2026.04.02.716003 medRxiv
Top 0.3%
1.9%

Background/Objectives: RNA polymerase II is a multifunctional complex that is critical for gene regulation and environmental responses. Its POLR2I subunit in human is associated with various pathologies, including cancer chemoresistance. However, much of our understanding of how POLR2I could function indirectly derives from studies of its homologs in yeasts called Rpb9. Here, we endogenously humanized the rpb9 gene of the fission yeast Schizosaccharomyces pombe to examine the functional capabilities of POLR2I. Methods: We edited the genomic rpb9 locus in S. pombe so that it encodes the human POLR2I protein, and investigated functional and structural conservation. Results: With our humanized yeast system, we find widespread functional complementation by human POLR2I of S. pombe rpb9 roles in yeast growth, chronological aging, and stress responses. We also find that POLR2I complements novel roles for yeast rpb9 in facultative heterochromatin assembly, resistance against the chemotherapy 5-fluorouracil, and resistance against the fungicide thiabendazole. In contrast, we find that POLR2I cannot complement the role of rpb9 in resistance against the transcription elongation inhibitor 6-azauracil (6-AU) in our system. Interestingly, POLR2I could complement 6-AU resistance if ectopically expressed. Lastly, we observe extensive structural homology between Rpb9 and POLR2I proteins. Conclusions: Our study establishes an endogenous cross-species gene complementation strategy that uncovers both conserved and rewired functions of fission yeast rpb9 and its human homolog, POLR2I. In addition to validating conserved roles, we also identified conservation of previously unrecognized roles of rpb9 in heterochromatin formation and chemoresistance.

10
Drosophila pseudoobscura third chromosome inversion arrangements have sex-specific effects on life history traits

Reyes Castellon, G. A.; Aimadeddine, G.; Chiao, C. R.; Guruprasad, S.; Halbert, P. E.; Hassan, S. A.; Luong, M. Q.; Mailanperuma Arachchillage, K. S.; Martinez, Y.; Mukhtarov, M.; Nair, G.; Nguyen, E. N.; Onochie, C. L.; Patel, O.; Than, J. T.; Manat, Y.; IISAGE; Meisel, R. P.

2026-04-08 evolutionary biology 10.64898/2026.04.06.716560 medRxiv
Top 0.3%
1.9%

Life history traits are often correlated, creating trade-offs that may impede the response to natural selection and be responsible for the evolution of senescence. These trade-offs may arise through pleiotropic effects, which can affect the response to selection in ways that resemble intra-locus sexual antagonism. Despite these hypothesized relationships, we lack clear connections between pleiotropy, sexual antagonism, and the evolution of life histories. Empirical tests for inter-sexual differences in life-history traits, including sex-specific aging, can be used to evaluate hypotheses about how pleiotropy and sexual conflict affect evolutionary trade-offs. To those ends, we measured lifespan, development time, and body size in Drosophila pseudoobscura males and females, each of which carried one of six third chromosome inversion genotypes. Temperature affected lifespan and development more than any other factor; higher temperatures increased mortality rate, decreased lifespan, and accelerated development. However, we also observed sex differences in mortality rates and development times that depended on genotype and temperature. Notably, temperature elevated the initial mortality rate across all flies, yet increasing temperatures reduced the rate of aging in some genotype-sex combinations. Similarly, direct effects of genotype on mortality rate and development time depended greatly on sex and temperature, but there was no genotype effect on body size. Despite these context-dependent genotype effects on life history traits, we failed to identify any correlations that would serve as clear evidence for sexual conflict or trade-offs. Our results therefore suggest that either historical conflicts have been resolved or any conflicts that may exist do not result in the correlations predicted by existing models.

11
Benchmarking SNP-Calling Accuracy Against Known Citrus Pedigrees Reveals Pangenome Advantages Over Linear References

Kuster, R. D.; Sisler, P.; Sandhu, K.; Yin, L.; Niece, S.; Krueger, R.; Dardick, C.; Keremane, M.; Ramadugu, C.; Staton, M. E.

2026-04-09 genomics 10.64898/2026.04.07.716967 medRxiv
Top 0.5%
1.5%

Background: Pangenomes are a promising new approach to genomics that can reduce reference bias in genotyping, but the reliability of such a data model remains unclear in tracking variation across species. To test the utility of graph-based pangenomes for interspecific breeding, we developed a Minigraph-Cactus super-pangenome representing four Citrus species derived from the founder lines of a citrus breeding program. To benchmark SNP calling accuracy using graph and linear-based approaches, we performed whole genome short read sequencing for two sets of pedigreed progeny: 30 F1 hybrids and 244 advanced hybrids from an F1 crossed with a parent not included in the pangenome. Results: The linear approach yielded more SNP calls than the graph-based approach; however, both methods exhibited similar Mendelian Inheritance Error Rates (MIER) in a tool-dependent manner. Reconstruction of parental haplotype blocks in the advanced hybrids revealed a striking improvement in performance in the pangenome graph-based calls, suggesting MIER is vulnerable to error when reference bias influences both parental and progeny genotype calls. Masking of regions diverged from the reference path improved MIER accuracy metrics and haplotype block reconstruction in both the linear and graph-based SNP calls. Conclusions: In non-model systems, inheritance patterns observed from pedigreed hybrids provide a framework for benchmarking variant-calling accuracy using pangenomes. SNP miscalls originating from diverged regions can falsely satisfy MIER filters; thus, we recommend haplotype block reconstruction for benchmarking. The inherent structure of the pangenome graph has promising applications for removing regions of unreliable mapping quality, which cannot otherwise be reliably removed using traditional filtering metrics.
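As a rough illustration of the Mendelian Inheritance Error Rate mentioned in this abstract, the sketch below computes the fraction of parent-parent-child trios whose child genotype cannot be formed from one allele of each parent. It assumes biallelic SNPs coded as alternate-allele counts (0, 1, 2) with `None` for missing calls; the function names are hypothetical and this is the textbook definition, not necessarily the calculation used by the tools in the preprint.

```python
from itertools import product

# Alleles (0 = reference, 1 = alternate) that a parent with a given
# alternate-allele count can transmit to its offspring.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def possible_child_genotypes(parent1, parent2):
    """Return the set of child alt-allele counts consistent with both parents."""
    return {a + b for a, b in product(GAMETES[parent1], GAMETES[parent2])}


def mendelian_error_rate(trios):
    """Fraction of fully genotyped trios that violate Mendelian inheritance.

    trios: iterable of (parent1, parent2, child) alt-allele counts,
    with None marking a missing call (such trios are skipped).
    """
    checked = 0
    errors = 0
    for parent1, parent2, child in trios:
        if None in (parent1, parent2, child):
            continue
        checked += 1
        if child not in possible_child_genotypes(parent1, parent2):
            errors += 1
    return errors / checked if checked else float("nan")


if __name__ == "__main__":
    example = [(0, 0, 0), (0, 2, 1), (0, 0, 1), (1, 1, 2), (2, 2, 0), (None, 1, 1)]
    print(mendelian_error_rate(example))
```

Note that a trio called (0, 0, 0) passes this check even if all three calls are wrong in the same direction, which is consistent with the abstract's point that shared reference bias can falsely satisfy MIER filters.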

12
Structure of the Arabidopsis receptor kinase SRF6 ectodomain determined from crystals obtained using the LRR crystallisation screen

Caregnato, A.; Hohmann, U.; Hothorn, M.

2026-03-23 plant biology 10.64898/2026.03.20.713188 medRxiv
Top 0.5%
1.4%

Plant-specific membrane receptor kinases with structurally diverse extracellular domains regulate key processes in plant growth, development, immunity and symbiosis. Structural studies of these glycoproteins are often hampered by the limited quantities in which they can be obtained. Here, we describe the LRR crystallization screen, which has enabled the successful crystallization and structure determination of multiple receptor kinase ectodomains, including ligand- and co-receptor-bound complexes. As an example, we report the 1.5 Å resolution crystal structure of the leucine-rich repeat (LRR) domain of STRUBBELIG-RECEPTOR FAMILY 6 (SRF6) from Arabidopsis thaliana. The SRF6 ectodomain contains seven LRRs and a disulfide-bond-stabilised N-terminal capping domain but lacks the canonical C-terminal cap and the N-glycosylation pattern typically observed in other family members. Previously reported protein-protein interactions between the SRF6 and SRF7 ectodomains and the receptor kinases BRI1, BRL1, BRL3, SERK3 and BIR1-3 could not be confirmed by quantitative isothermal titration calorimetry and grating-coupled interferometry assays, suggesting that these structurally conserved LRR receptor kinases may have signalling functions outside the brassinosteroid pathway.
Synopsis: A crystallisation screen that has enabled the structural analysis of various extracellular domains of plant membrane receptor kinases is described together.

13
EFN-4/Ephrin converges with SAX-3/Robo, UNC-6/Netrin, and Heparan Sulfate Proteoglycan signaling to control MAB-5/Hox-dependent posterior Q neuroblast migration in Caenorhabditis elegans

Jain, V. D.; Johannesen, A.; Teixeira, F. L.; Lundquist, E. A.

2026-03-31 developmental biology 10.64898/2026.03.27.714887 medRxiv
Top 0.5%
1.3%

Hox genes have been broadly implicated in nervous system development, but the molecular and genetic mechanisms that act downstream of Hox factors remain to be identified. The MAB-5 antennapedia-like Hox transcription factor is both necessary and sufficient to cause posterior migration of the Q neuroblast descendants in Caenorhabditis elegans. In response to MAB-5, the left-side QL descendants QL.a and QL.ap undergo a three-stage migration process, with each stage characterized by a posterior lamellipodial protrusion followed by cell body migration. The QL.ap cell differentiates into the PQR neuron posterior to the anus. Previous studies showed that the MAB-5-regulated gene efn-4/Ephrin was required for the third and final stage of QL.ap migration, with efn-4 mutation resulting in placement of PQR immediately anterior to the anus. This subtle and previously-undescribed phenotype opens the possibility that other known neuronal development genes could be involved. In this work, we screened known signaling mutants for third-stage PQR migration defects. We found that mutations in SAX-3/Robo signaling, UNC-6/Netrin signaling, and heparan sulfate proteoglycans (HSPGs) all displayed third-stage PQR migration defects. The effects in single mutants were weak compared to efn-4, and double mutant analysis revealed lack of genetic synergy, consistent with all of these molecules converging on a common pathway. This genetic analysis is consistent with physical interaction studies in vitro from another group that suggest that these molecules form connected communities of interacting extracellular domains, raising the possibility that they are all components of a large extracellular signaling complex required for posterior QL.ap migration. 
In this model, we envision that MAB-5/Hox drives EFN-4/Ephrin expression in QL.ap, which then seeds the formation of an extracellular signaling complex containing SAX-3/Robo signaling, UNC-6/Netrin signaling, and HSPGs that drives posterior lamellipodial formation and posterior migration.

14
Transposable element disruption of a second thyroglobulin-like gene confers Vip3Aa resistance in Helicoverpa armigera

Bachler, A.; Walsh, T. K.; Andrews, D.; Williams, M.; Tay, W. T.; Gordon, K. H.; James, B.; Fang, C.; Wang, L.; Wu, Y.; Stone, E. A.; Padovan, A.

2026-04-09 genomics 10.64898/2026.04.06.716841 medRxiv
Top 0.5%
1.3%

Background: The cotton bollworm Helicoverpa armigera is a major global pest controlled by genetically engineered crops expressing Bacillus thuringiensis (Bt) toxins, including Vip3Aa. While Vip3Aa is widely deployed, the genetic basis of resistance remains poorly understood. Previous work identified disruption of a thyroglobulin-like gene (HaVipR1) as one mechanism of resistance, suggesting additional loci may be involved. Results: Using linkage analysis, transcriptomics, long-read sequencing, and CRISPR-Cas9 gene editing, we identify a second thyroglobulin-like gene, HaVipR2, as a novel mediator of Vip3Aa resistance. Resistance in a field-derived H. armigera line was shown to be monogenic, recessive, and autosomal, mapping to chromosome 29. Long-read sequencing revealed a ~16 kb transposable element insertion disrupting HaVipR2, which was undetectable using standard short-read approaches. CRISPR-Cas9 knockout of HaVipR2 conferred >900-fold resistance, confirming its causal role. Comparative analyses show that HaVipR1 and HaVipR2 share conserved domain architecture, indicating that thyroglobulin-domain proteins represent a recurrent target of resistance evolution. Conclusions: Our findings establish thyroglobulin-domain proteins as a new class of Bt resistance genes in Lepidoptera and demonstrate that transposable element insertions can drive adaptive resistance while evading detection by conventional methods. These results highlight the importance of long-read sequencing and accurate genome annotation for resistance monitoring and provide new insights into the molecular basis and evolution of Vip3Aa resistance.

15
Optimizing resource allocation in Miscanthus breeding with sparse testing designs for genomic prediction

Proma, S.; Lubanga, N.; Sacks, E.; Leakey, A. D. B.; Zhao, H.; Ghimire, B. K.; Lipka, A. E.; Njuguna, J. N.; Yu, C. Y.; Seong, E. S.; Yoo, J. H.; Nagano, H.; Anzoua, K. G.; Yamada, T.; Chebukin, P.; Jin, X.; Clark, L. V.; Petersen, K. K.; Peng, J.; Sabitov, A.; Dzyubenko, E.; Dzyubenko, N.; Glowacka, K.; Nascimento, M.; Campana Nascimento, A. C.; Dwiyanti, M. S.; Bagment, L.; Shaik, A.; Garcia-Abadillo, J.; Jarquin, D.

2026-03-23 genomics 10.64898/2026.03.18.712722 medRxiv
Top 0.6%
1.3%

Phenotyping high-biomass perennial crops is laborious and the rate of genetic gain in perennial crop breeding programs is typically low. So, it is especially important to identify methods that produce efficiency gains in the breeding process. Miscanthus is a C4 perennial grass with favorable characteristics for producing biomass as a feedstock for biofuels and diverse biobased products. Increasing biomass yield will increase profitability and environmental benefits, so is a key target for Miscanthus breeding. In addition, the identification of well-adapted genotypes across a wide range of environmental conditions requires the establishment of multi-environment trials (METs). Sparse testing is a genomic prediction-based strategy that reduces the phenotyping costs in METs by selecting a subset of genotypes to evaluate in a subset of environments and then predicts the performance of the unobserved genotype-environment combinations. A Miscanthus sacchariflorus (MSA) population comprising 336 genotypes observed across three environments was analyzed. Three prediction models considering main effects (environments, genotypes, genomic) and interaction effects (genotype-by-environment; GxE interaction) were implemented for forecasting dry biomass yield (YDY), total culm (TCM), average internode length (AIL), and culm node number (CNN). Multiple calibration sets based on different compositions and sizes were considered to evaluate performance in terms of the predictive ability (PA) and the mean square error (MSE) for a fixed testing set size. The training set size ranged from 52 to 112 to predict a fixed set of 224 unobserved genotypes across all three environments. The results showed that the model accounting for GxE interaction presented the highest PA and the lowest MSE for CNN (PA: ~0.77, MSE: ~0.5) and YDY (PA: ~0.70, MSE: ~1.3) while for TCM and AIL these ranged from ~0.28 to 0.41 and ~1.3 to 4.3, respectively.
Overall, varying training sets and allocation strategies did not affect PA and MSE, with 52 non-overlapping and 0 overlapping genotypes per environment as the optimal cost-effective allocation framework. This suggests that implementing sparse testing designs could significantly reduce phenotyping costs by fivefold, without compromising PA in breeding programs for perennial crops such as Miscanthus.

16
Robust Random Forests for Genomic Prediction: Challenges and Remedies

Lourenco, V. M.; Ogutu, J. O.; Piepho, H.-P.

2026-04-01 bioinformatics 10.64898/2026.03.30.715203 medRxiv
Top 0.6%
1.2%

Data contamination, from recording errors to extreme outliers, can compromise statistical models by biasing predictions, inflating prediction errors, and, in severe cases, destabilizing performance in high-dimensional settings. Although contamination can affect responses and covariates, we focus on response contamination and evaluate Random Forests through simulation. Using a synthetic animal-breeding dataset, we assess robust Random Forests across several contamination scenarios and validate them on plant and animal datasets. We thereby clarify the consequences of contamination for prediction, develop a robust Random Forest framework, and evaluate its performance. We examine preprocessing or data-transformation strategies, algorithmic modifications, and hybrid approaches for robustifying Random Forests. Across these approaches, data transformation emerges as the most effective strategy, delivering the strongest performance under contamination. This strategy is simple, general, and transferable to other Machine Learning methods, offering a remedy for robust genomic prediction. In real breeding data, robust Random Forests are useful when substantial contamination, phenotypic corruption, misrecording, or train-deployment mismatch is plausible and the goal is to recover a latent signal for genomic prediction and selection; ranking-based robust Random Forests are the dependable first option, whereas weighting-based Random Forests should be used only when their weighting scheme preserves rank structure and improves prediction. Robustification is not universally necessary, but it becomes important when contamination distorts the link between observed responses and the predictive target; standard Random Forests remain the default for clean data, whereas robust Random Forests should be fitted alongside them whenever contamination is plausible, with the final choice guided by data, trait, and breeding objective.
Author summary: Machine learning (ML) methods are widely used for prediction with high-dimensional, complex data, and supervised approaches such as Random Forests (RF) have proved effective for genomic prediction (GP) and selection. Yet their performance can be severely compromised by data contamination if the algorithms rely on classical data-driven procedures that are sensitive to atypical observations. Robustifying ML methods is therefore important both for improving predictive performance under contamination and for guiding their practical use in high-dimensional prediction problems. To address this need, we develop robust preprocessing, algorithm-level, and hybrid strategies for improving RF performance with contaminated data. Using simulated animal data, we show that ranking- and weighting-based robust RF provide the strongest overall compromise for genomic prediction and selection under contamination. Validation on several plant and animal breeding datasets further shows that the benefits of robustification are not universal, but depend on the dataset, trait, and breeding objective. Although motivated by RF, the framework we propose is general, practical, and readily transferable to other ML methods. It also offers a basis for deciding when robustness should complement standard RF rather than replace it outright.

17
A Fluorescent Dauer Marker in Caenorhabditis inopinata Enables Comparative Analysis of Dauer-Inducing Mechanisms

Iitsuka, R.; Haruta, N.; Oomura, S.; Sugimoto, A.

2026-04-09 developmental biology 10.64898/2026.04.06.716796 medRxiv
Top 0.7%
1.0%

Dauer larvae are a dormant developmental stage in nematodes that is induced by a range of environmental cues. The molecular mechanisms that transduce these cues to regulate dauer entry have been well characterized in Caenorhabditis elegans, whereas those in other nematode species remain unclear. The closest known sibling species of C. elegans, Caenorhabditis inopinata, occupies a distinct ecological niche and shows an extremely low frequency of dauer formation by starvation in laboratory conditions, suggesting that it could serve as a useful comparative model for analyzing dauer-inducing mechanisms. To support such analysis, we generated a fluorescent dauer reporter, Cin-col-183p::mCherry, in C. inopinata based on a previously reported dauer-specific reporter in C. elegans. This reporter showed fluorescence specifically in the pre-dauer and dauer stages, but not in other developmental stages, indicating that it functions as a dauer-specific marker in C. inopinata. Using these marker strains, we compared the responses to high temperature and RNAi-mediated knockdown of insulin/IGF-1 pathway genes (daf-2, age-1, and pdk-1), and found that dauer induction differs mechanistically between C. elegans and C. inopinata. This dauer-specific fluorescent strain will be a useful tool for investigating the diversity of dauer-inducing mechanisms across nematode species.
Article Summary: Dauer is a dormant developmental stage in nematodes induced by environmental stress. Although its regulation is well studied in Caenorhabditis elegans, the mechanisms in other species remain unclear. Here, we developed a fluorescent dauer reporter, Cin-col-183p::mCherry, in Caenorhabditis inopinata, a close relative of C. elegans. The reporter was specifically expressed in pre-dauer and dauer stages, confirming its usefulness as a dauer marker. Using this strain, we found that responses to high temperature and insulin/IGF-1 pathway gene knockdown differ between C. elegans and C. inopinata. This reporter will help reveal diversity in dauer-inducing mechanisms across nematode species.

18
Genomic sampling and population structure of farmer-maintained varieties reveal previously uncharacterized diversity of Theobroma cacao L. in Costa Rica

Herrighty, E. M.; Specht, C. D.; Gore, M. A.; Solano, L.; Estrada-Gamboa, J.; Hernandez, C. E.; Tufan, H. A.; Landis, J. B.

2026-04-01 genomics 10.64898/2026.03.30.715340 medRxiv
Top 0.7%
0.9%

Understanding crop genetic diversity is essential for conservation and breeding, yet farmer-maintained germplasm remains largely underrepresented in genomic studies. Theobroma cacao L. has a complex domestication history and extensive global diversity, and cacao currently cultivated in Central America, particularly in Costa Rica, has been understudied compared to South American and Mexican cultivars despite cultural and historical importance. In this study, we investigate the genetic diversity of cacao from farmer-managed systems across Costa Rica to search for Criollo germplasm and identify and characterize any unique local genetic groups. Ninety-four trees were sampled from 17 farms across four regions of the country and sequenced using whole genome resequencing. Farmer materials were analyzed alongside 166 previously characterized reference accessions representing major cacao genetic groups. Population structure analyses, phylogenetic reconstruction, and network approaches revealed that Costa Rican cacao encompasses multiple known genetic groups, including Criollo-derived lineages, while also harboring locally distinct diversity not fully represented in current global reference collections. Analyses revealed close kinship between many accessions with no clear geographic patterns corresponding to the observed population differentiation, reflecting the effects of farmers in creating dominant patterns of gene flow through seed-saving, clonal propagation, and sharing genotypes among farms. Heterozygosity levels varied substantially among individuals, consistent with a mixture of highly inbred Criollo trees and more heterozygous, admixed genotypes. We find that farmer-managed cacao systems are reservoirs of genetic diversity, including possibly rare or historically important lineages, underscoring the value of these farming systems for effective conservation and management of genomic resources for cacao resilience and improvement.

19
Seasonal fluctuations in fitness result in severe reductions in effective population size

Johnson, O. L.; Tobler, R.; Schmidt, J. M.; Huber, C. D.

2026-04-01 evolutionary biology 10.64898/2026.03.30.715388 medRxiv
Top 0.7%
0.9%

Genetic evidence for fluctuating selection has begun to accumulate for different species over the past few decades, especially for the Drosophila genus where studies have reported hundreds of loci undergoing putatively adaptive oscillations across successive seasons. However, most theoretical and simulation studies of fluctuating selection have relied on abstract or weakly parameterized models, making it difficult to assess their relevance for natural populations. In this study, we simulate multilocus seasonally fluctuating selection under a recently developed model and examine its effect on the variance effective population size (Ne) at a genome-wide scale. By recapitulating genomic, demographic, and evolutionary parameters from natural Drosophila populations in our simulations, we were able to reproduce allele frequency oscillations reported in recent studies and show that these lead to ~50% genome-wide reductions in Ne. We also demonstrate that Ne reductions are well predicted by the maximum frequency amplitude among all adaptively fluctuating loci, and that the frequency amplitudes are largely determined by the number of adaptively fluctuating loci and the strength of their epistatic interactions. Our results demonstrate that fluctuating selection can substantially reduce effective population size and underscore the importance of temporally variable selection in shaping genome-wide patterns of variation beyond classical models.
Article Summary: Genetic studies of fluctuating selection in natural populations have grown steadily over the past decade, with reports suggesting that hundreds of loci undergo adaptive oscillations over seasonal timescales in cosmopolitan Drosophila populations. By simulating seasonally fluctuating selection under a recently developed model and ecological scenarios informed by published studies, the authors show that this mode of selection can reduce effective population size by ~50%, with the magnitude of the reduction correlated with the locus exhibiting the largest allele frequency fluctuations. These findings highlight fluctuating selection as an important factor shaping genome-wide patterns of genetic variation and effective population size.

20
A confined gene drive for population modification in the malaria vector Anopheles stephensi

Xu, X.; Liu, Y.; Jia, X.; Yang, J.; Xia, Y.; Chen, J.; Champer, J.

2026-04-03 genetics 10.64898/2026.04.01.715791 medRxiv
Top 0.8%
0.9%

Gene drives are genetic elements that bias their own inheritance to spread desired traits in target populations, enabling population modification or suppression. Although homing-based drives can propagate efficiently, their potential for uncontrolled spread may present a challenge for field deployment. Thus, confined drive systems are needed. Here, we developed a confined modification drive, called Toxin-Antidote Recessive Embryo (TARE) drive, in the globally important malaria vector Anopheles stephensi. This drive works by cleaving and disrupting wild-type alleles in the germline or early embryo from maternally deposited Cas9. Disrupted alleles are recessive lethal, thus increasing the drive in a frequency-dependent manner. Inheritance bias was moderate in crosses between drive heterozygote mosquitoes, possibly due to low gRNA activity and thus moderate germline cleavage rates. Single-release cage trials confirmed the TARE drive's ability to spread, although the drive ultimately declined due to fitness costs and resistance alleles associated with repetitive elements. Nonetheless, our modelling analysis indicates that this TARE system could achieve population spread if the resistance issue is addressed. These findings demonstrate a functional prototype TARE drive in Anopheles stephensi and highlight key parameters governing its performance. Minor design optimizations could substantially improve efficiency and integrity, enabling rapid but confined population modification.